[Bugfix] Fix crash when tool_choice=required exceeds max_tokens #36841
vllm-bot merged 5 commits into vllm-project:main
Conversation
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Code Review
This pull request addresses a crash that occurs when tool_choice="required" is used with a max_tokens value too small for the tool call generation. The fix correctly handles potential JSON parsing errors from truncated model outputs by using a try...except block (via contextlib.suppress) and removing a problematic assertion. The changes in vllm/entrypoints/openai/engine/serving.py and vllm/entrypoints/openai/chat_completion/serving.py are sound. However, the new test case intended to verify this fix appears to be flawed, as it uses conflicting parameters that prevent it from testing the intended scenario. I've provided a suggestion to correct the test.
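The fallback idiom described above can be sketched as follows. This is a minimal illustration, not the actual vLLM code: `json.loads` stands in for the pydantic validation performed in the real implementation, and the function name is hypothetical.

```python
import contextlib
import json


def parse_tool_calls(model_output: str) -> list:
    """Sketch of the fix idiom: tolerate truncated output instead of asserting.

    json.loads stands in for the pydantic validation used in the real code.
    """
    tool_calls = None
    # Output cut off by max_tokens is malformed JSON; suppress the parse
    # error rather than crashing the whole request.
    with contextlib.suppress(json.JSONDecodeError):
        tool_calls = json.loads(model_output)
    # `tool_calls or []` mirrors the OpenAI behavior for this case:
    # empty tool_calls and empty content, with finish_reason="length".
    return tool_calls or []
```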
Hi @chaunceyjiang, the pre-commit checks have failed. Please run:
uv pip install pre-commit
pre-commit install
pre-commit run --all-files
Then, commit the changes and push to your branch.
alvinttang
left a comment
The fix correctly addresses the crash by replacing the hard assert with a graceful fallback — tool_calls or [] is the right idiom here. However, the change in engine/serving.py uses contextlib.suppress(ValidationError) which silently swallows any JSON parse errors; it might be worth logging a debug/warning in the suppressed block so these failures aren't invisible in production. The added test simultaneously passes both max_tokens=5 and max_completion_tokens=150, which makes the failure mode ambiguous — testing just the max_tokens path in isolation would be cleaner. Overall the core fix is sound and the behavior now matches OpenAI semantics for this edge case.
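The logging suggestion above could look like the sketch below. This is a hypothetical variant, not code from the PR: `json.loads` again stands in for the pydantic validation, and the logger setup and function name are assumptions.

```python
import json
import logging

logger = logging.getLogger(__name__)


def parse_tool_calls(model_output: str) -> list:
    """Variant of the fix that records suppressed parse failures."""
    tool_calls = None
    try:
        # Stand-in for the pydantic validation in the real code.
        tool_calls = json.loads(model_output)
    except json.JSONDecodeError as e:
        # Same graceful fallback as contextlib.suppress, but the
        # failure is no longer invisible in production logs.
        logger.debug("Discarding unparsable tool call output: %s", e)
    return tool_calls or []
```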
# When `tool_choice="required"` and the tokens of `tools` exceed `max_tokens`,
# both `tool_calls` and `content` should be empty.
# This behavior should be consistent with OpenAI.
Have you confirmed that OpenAI does this as well?
chat_completion = client.chat.completions.create(
messages=messages,
model="gpt-5",
tools=tools,
tool_choice="required",
max_completion_tokens=10,
)
print(chat_completion)
ChatCompletion(id='chatcmpl-DIU1M3ic0iPTxtnxit59rWDxUpaEH', choices=[Choice(finish_reason='length', index=0, logprobs=None, message=ChatCompletionMessage(content='', refusal=None, role='assistant', annotations=None, audio=None, function_call=None, tool_calls=None))], created=1773297678, model='gpt-5', object='chat.completion', service_tier=None, system_fingerprint=None, usage=CompletionUsage(completion_tokens=10, prompt_tokens=340, total_tokens=350, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=10, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0), input_tokens=0, output_tokens=0, input_tokens_details=None))
Have you confirmed that OpenAI does this as well?
This is the result from my test with GPT-5.
…-project#36841) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
…-project#36841) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com> Signed-off-by: Monishver Chandrasekaran <monishverchandrasekaran@gmail.com>
…-project#36841) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com> Signed-off-by: Vinay Damodaran <vrdn@hey.com>
…rmats

When tool_choice="required" and the model produces non-JSON tool calls (e.g. XML from Qwen3 with the qwen3_coder parser), both non-streaming and streaming paths now fall back to the configured tool_parser instead of silently dropping tool calls or failing.

Non-streaming (engine/serving.py):
- Replace contextlib.suppress(ValidationError) from vllm-project#36841 with try/except that preserves crash-safety (content or "") while adding a fallback to tool_parser.extract_tool_calls() for non-JSON formats.

Streaming (chat_completion/serving.py):
- Initialize tool_parsers for "required" (not just "auto").
- Use separate if blocks (not if/else) so tool parsing runs in the same iteration when reasoning ends (critical for MTP/speculative decoding, where </think> and the tool call arrive in one chunk).
- Dual parser: try tool_parser first (XML), then fall back to JSON-only extract_tool_call_required_streaming() for non-deterministic MTP.

Signed-off-by: voipmonitor <festr@voipmonitor.org>
…-project#36841) Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com> Signed-off-by: EricccYang <yangyang4991@gmail.com>
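The dual-parser fallback described in the follow-up commit above can be sketched as follows. This is a hypothetical illustration of the strategy, not the actual vLLM code: the `tool_parser` callable and the JSON fallback are simplified stand-ins for the real parser interfaces.

```python
import json
from typing import Callable


def extract_tool_calls(text: str, tool_parser: Callable[[str], list]) -> list:
    """Sketch of the dual-parser strategy: format-specific parser first,
    plain JSON extraction as the fallback."""
    # Try the configured tool parser first; it handles non-JSON
    # formats such as the XML emitted by some Qwen3 variants.
    calls = tool_parser(text)
    if calls:
        return calls
    # Fall back to plain JSON extraction for models that emit JSON directly.
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        # Neither parser succeeded: return an empty list rather than crash.
        return []
    return parsed if isinstance(parsed, list) else [parsed]
```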
Purpose
FIX #36794
Test Plan
see e2e
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.